Non-Monotonic Parsing of Fluent Umm I mean Disfluent Sentences
Authors
Abstract
Parsing disfluent sentences is a challenging task that involves detecting disfluencies as well as identifying the syntactic structure of the sentence. While several recent studies have focused solely on detecting disfluencies at a high performance level, relatively little work on joint parsing and disfluency detection has reached that state-of-the-art performance in disfluency detection. We improve upon recent work on this joint task through the use of novel features and learning cascades to produce a model that achieves an F-score of 82.6. It outperforms the previous best in disfluency detection on two different evaluations.
Similar resources
Improving Translation Fluency with Search-Based Decoding and a Monolingual Statistical Machine Translation Model for Automatic Post-Editing
The BLEU scores and translation fluency for the current state-of-the-art SMT systems based on IBM models are still too low for publication purposes. The major issue is that stochastically generated sentence hypotheses, produced through a stack decoding process, may not strictly follow the natural target language grammar, since the decoding process is directed by a highly simplified translation...
On the generation of synthetic disfluent speech: local prosodic modifications caused by the insertion of editing terms
Disfluent speech synthesis is necessary in some applications such as automatic film dubbing or spoken translation. This paper presents a model for the generation of synthetic disfluent speech based on inserting each element of a disfluency in a context where they can be considered fluent. Prosody obtained by the application of standard techniques on these new sentences is used for the synthesis...
A Unified Syntactic Model for Parsing Fluent and Disfluent Speech
This paper describes a syntactic representation for modeling speech repairs. This representation makes use of a right corner transform of syntax trees to produce a tree representation in which speech repairs require very few special syntax rules, making better use of training data. PCFGs trained on syntax trees using this model achieve high accuracy on the standard Switchboard parsing task.
Toddlers are sensitive to prosodic correlates of disfluency in spontaneous speech
The ability to distinguish fluent from disfluent speech could play an important role in infants’ acquisition of their first language. Across two experiments using a Headturn Preference Procedure, we show that infants are able to distinguish fluent from disfluent speech based on its prosodic characteristics, and show a preference for listening to fluent English. In the first experiment, 22-month...
Exploring Features for Identifying Edited Regions in Disfluent Sentences
This paper describes our effort on the task of edited region identification for parsing disfluent sentences in the Switchboard corpus. We focus our attention on exploring feature spaces and selecting good features, and start by analyzing the distributions of the edited regions and their components in the targeted corpus. We explore new feature spaces of a part-of-speech (POS) hierarchy and rela...
Publication date: 2014